Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug fix for line number calculations #194

Merged
merged 4 commits into from
Apr 1, 2019
Merged

Bug fix for line number calculations #194

merged 4 commits into from
Apr 1, 2019

Conversation

Xenomega
Copy link
Member

This issue aims to resolve #180 . By default, python's open() function parses CR LF (\r\n) line endings as LF (\n). During line number calculation, a byte offset is converted into line numbers by splitting the string with a \n character, and summing up the length of each line until the correct line byte range is found. With the removal of CR control characters, a miscalculation of line lengths would cause a later-than-intended line range to be reported.

This fixes the issue by adding a newline='' argument to the open() functions related to source mapping. The use of a blank new line will allow python's new line related parsing functions to work, while retaining CR LF line endings. Reference: https://stackoverflow.com/questions/20350305/python-read-crlf-text-file-as-is-with-crlf

NOTE: This means that the source_code string property for a source mapping will now retain CR characters within them. This property is not currently used for anything but line number calculations, making it non-problematic. In the future, if code intends to print/operate on these source code properties, it should be mindful that line endings can be CR LF or LF.

@CLAassistant
Copy link

CLAassistant commented Mar 29, 2019

CLA assistant check
All committers have signed the CLA.

@disconnect3d
Copy link
Contributor

For clarity, I believe this:

During line number calculation, a byte offset is converted into line numbers by splitting the string with a \n character, and summing up the length of each line until the correct line byte range is found.

Is implemented in

@staticmethod
def _compute_line(source_code, start, length):
"""
Compute line(s) number from a start/end offset
Not done in an efficient way
"""
total_length = len(source_code)
source_code = source_code.split('\n')
counter = 0
i = 0
lines = []
while counter < total_length:
counter += len(source_code[i]) +1
i = i+1
if counter > start:
lines.append(i)
if counter > start+length:
break
return lines

Btw. this should probably use .splitlines(); also note that there is os.linesep that is cross platform but I doubt it is helpful here.

@Xenomega
Copy link
Member Author

Xenomega commented Mar 30, 2019

@disconnect3d Yes, that was the line calculation code which caused the miscount due to CR LF being parsed as LF line endings. Thanks for the suggestion too, I've updated the code to use splitlines(). Although it splits on some additional characters which IDEs do not consider as new lines, Solidity disallows them as ILLEGAL characters, which should make it safe/efficient to use. That should also fix up another edge case with sole CR endings.

@montyly montyly changed the base branch from master to dev April 1, 2019 10:17
@montyly montyly merged commit e328b65 into dev Apr 1, 2019
@montyly montyly deleted the dev-fix-line-numbers branch April 5, 2019 17:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants